


File: USPT 



Dec 12, 1995 



DOCUMENT-IDENTIFIER: US 5475823 A

TITLE: Memory processor that prevents errors when load instructions are moved in the 
execution sequence 

Abstract Text (1) : 

A memory processor that prevents errors when the compiler advances long latency
load instructions in the instruction sequence to reduce the loss of efficiency
resulting from the latency time. The memory processor intercepts all load and store
instructions before the instructions enter the memory pipeline. The memory
processor stores load instructions for a period of time sufficient to determine whether
any subsequent store instruction that would have been executed prior to the load
instruction, had the load instruction not been moved, references the same address as
that specified in the load instruction. If a store instruction references the load
instruction address, the invention returns the same data that the load instruction
would have returned had it not been moved by the compiler.

Detailed Description Text (2) : 

The present invention effectively separates the time when the address (and other 
parameters) of a load instruction is presented to a memory and the time at which the 
load instruction effectively samples the state of the memory. In prior art systems, 
the time that the address of a load instruction is presented to the memory is also 
the time that the load effectively samples the state of the memory. This is also the 
time that the load is said to be "executed". Even in pipelined memory systems in 
which load instructions physically sample the state of the memory a number of cycles 
after the load instruction is issued, the memory is effectively sampled at the 
instant that the load is issued because the data returned by the load reflects all
memory state modifications made by operations issued before the load.
This is true even if these operations have not physically updated the memory by the 
time the load is issued. 



Detailed Description Text (5) : 

When used in conjunction with the present invention, a "watch-window" is defined for 
each load instruction that is moved by the compiler. In one embodiment of the 
present invention, the compiler stores a count in each long latency load instruction 
that indicates the number of instructions over which it was moved with respect to 
the code ordering implied in the original program. The present invention detects 
such load instructions as they enter the memory pipeline and stores information 
specifying the load instruction and the number of instructions over which it was 
moved. Denote the number of instructions over which the long latency load 
instruction was moved by N. On each of the following N instruction cycles, the 
present invention examines the instructions entering the memory pipeline to 
determine if the instruction in question is a store instruction referencing the same 
memory location as that specified in the load instruction. If no such store 
instruction is detected, the load instruction in question will return valid data and 
no action need be taken. If, however, a store instruction referencing the memory 
location in question was detected during the N instruction cycles, the present 
invention causes the long latency load instruction to be re-executed at the location
in the code sequence at which it would have been executed without the move. During 
this re-execution, the present invention signals the CPU to suspend operations for 
the latency time of the load instruction in question. 
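The watch-window behavior described above can be modeled in software as a
minimal sketch. This is an illustration only, not the patented hardware: the
names WatchedLoad, WatchWindow, step and moved_by are hypothetical, addresses
are compared for exact equality, and one call to step stands for one
instruction cycle. The CPU suspension during re-execution is not modeled.

```python
from dataclasses import dataclass


@dataclass
class WatchedLoad:
    """A moved load whose watch window is still open (hypothetical model)."""
    address: int
    remaining: int          # instruction cycles left in the window (starts at N)
    conflict: bool = False  # set when a store to the same address is seen


class WatchWindow:
    """Tracks moved loads and reports those that must be re-executed."""

    def __init__(self):
        self.pending = []

    def step(self, op, address=None, moved_by=0):
        """Process one instruction entering the memory pipeline.

        op is "load", "store", or any other string for a non-memory
        instruction. moved_by is the count N stored in a moved load.
        Returns the addresses of loads whose window closed on this cycle
        with a conflicting store, i.e. loads that need re-execution here.
        """
        # Check the incoming instruction against every open window.
        for load in self.pending:
            if op == "store" and address == load.address:
                load.conflict = True
            load.remaining -= 1

        # Windows that just closed mark the load's original code position.
        closed = [l for l in self.pending if l.remaining == 0]
        self.pending = [l for l in self.pending if l.remaining > 0]

        # Open a new window for a load that the compiler moved.
        if op == "load" and moved_by > 0:
            self.pending.append(WatchedLoad(address, moved_by))

        return [l.address for l in closed if l.conflict]


# Usage: a load of address 0x10 moved up over two instructions,
# one of which is a store to the same address.
w = WatchWindow()
first = w.step("load", 0x10, moved_by=2)
second = w.step("store", 0x10)
third = w.step("add")
```

In this model the re-execution request surfaces on the cycle in which the
window closes, which corresponds to the position the load would have occupied
without the move.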



Detailed Description Text (17) : 

The embodiments of the present invention described above cause a load instruction 
whose address has been used in a store instruction during the watch window to be
re-issued at the point in the code sequence that the load instruction would have 
been issued had the load instruction not been moved by the compiler. An alternative 
mechanism for dealing with this situation is to provide a data forwarding system. A 
block diagram of memory processor 510 according to the present invention which 
utilizes such a data forwarder is illustrated in FIGS. 5 and 6. As was the case with 
the previously described embodiments of the present invention, memory processor 510 
detects load and store instructions communicated by a CPU 516 to a memory 512. The 
instructions are detected by instruction detector 520 which recognizes load 
instructions that have been advanced in the instruction sequence by the compiler. 
Information specifying the load instruction is logged in a register file 522 as 
described above with reference to memory processor 10. Information specifying the 
load instruction addresses is also logged in a data forwarding circuit 550. The load 
instruction proceeds to query memory 512 and the corresponding memory data is 
returned to data forwarder 550 at the end of the latency period. The data may also 
be returned directly to the register file in CPU 516. Address comparator 528 
compares the addresses of all store instructions with the addresses of the load 
instructions stored in register file 522 to check for partial or complete overlap of 
the memory locations accessed by the load and store instructions. If a store 
instruction overwrites part or all of the data accessed by the load instruction, 
controller 524 sets the flag corresponding to the load instruction in question. 
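The overlap check and flag setting described above can be sketched as follows.
This is a hedged illustration, not the circuit of FIGS. 5 and 6: the access
sizes in bytes, the function ranges_overlap, and the class LoadLog with its
methods are assumptions introduced here, since the text specifies only that
partial or complete overlap is detected. What the data forwarder does with a
set flag is not modeled.

```python
def ranges_overlap(load_addr, load_size, store_addr, store_size):
    """Return True if the two byte ranges share at least one byte.

    Each range is half-open: [addr, addr + size).
    """
    return (load_addr < store_addr + store_size
            and store_addr < load_addr + load_size)


class LoadLog:
    """Hypothetical stand-in for the logged loads and their flags."""

    def __init__(self):
        # load id -> {"addr": int, "size": int, "flag": bool}
        self.entries = {}

    def log_load(self, load_id, addr, size):
        """Record an advanced load when it is detected."""
        self.entries[load_id] = {"addr": addr, "size": size, "flag": False}

    def observe_store(self, addr, size):
        """Compare a store against every logged load and set flags.

        Returns the ids of loads newly flagged by this store.
        """
        flagged = []
        for load_id, entry in self.entries.items():
            if not entry["flag"] and ranges_overlap(
                    entry["addr"], entry["size"], addr, size):
                entry["flag"] = True
                flagged.append(load_id)
        return flagged


# Usage: an 8-byte load at address 100, followed by two stores.
log = LoadLog()
log.log_load(1, addr=100, size=8)
adjacent = log.observe_store(addr=108, size=4)  # touches bytes 108..111 only
partial = log.observe_store(addr=104, size=4)   # overwrites bytes 104..107
```

The half-open range comparison treats a store that begins exactly where the
load's data ends as non-overlapping, so only stores that overwrite at least
one byte of the loaded data set the flag.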